Background

■ NAS Parallel Benchmarks (NPB, 1991)
http://www.nas.nasa.gov/Software/NPB

■ Lacking in the area of irregular and dynamically
changing memory access


Motivation

■ Do we care about problems with irregular, dynamic
memory access?

YES. 

■ Problems with localized error source benefit from adaptive 
nonuniform meshes 

■ Do we need this benchmark? 

YES. 

■ Certain machines perform poorly on such problems

■ Parallel implementation may provide further performance 
improvement but is difficult: 

- load balancing / data (re)distribution

- data dependence 

- false and true data sharing 


Application Selection 


■ Representative of problem class relevant to scientific 
computing community 

■ Simple without sacrificing credibility and effectiveness 
- Stylized heat transfer problem

- Can be load balanced for range of processor sets 
with little communication and remapping 

- Spectral Element Method (Patera) 

■ Features irregular, dynamic memory accesses

- Adaptive Nonconforming Mesh 


Heat Transfer Problem

■ Mathematical model

- Diffusion


Spectral Element Method

- High-order weighted residual technique 
which combines 

- Geometrical flexibility of the finite element method
- High accuracy and rapid convergence of the spectral
method

■ Variational form (GLL quadrature)

■ Elemental discrete equations

■ Global discrete equations
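
The GLL (Gauss-Lobatto-Legendre) quadrature named above can be illustrated with a minimal sketch. The function names and the Newton iteration with Chebyshev starting guesses are assumptions chosen for illustration, not taken from the benchmark. The sketch computes the n+1 nodes and weights on [-1, 1], which integrate polynomials up to degree 2n-1 exactly.

```python
import math


def legendre_pair(n, x):
    """Return (P_n(x), P_{n-1}(x)) using the three-term recurrence (n >= 1)."""
    p_prev, p = 1.0, x  # P_0 and P_1
    for k in range(2, n + 1):
        p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
    return p, p_prev


def gll_points_weights(n, tol=1e-14, max_iter=100):
    """Nodes and weights of the (n+1)-point GLL rule on [-1, 1], for n >= 1.

    The nodes are the roots of q(x) = x*P_n(x) - P_{n-1}(x), which is
    proportional to (1 - x^2) * P_n'(x); its derivative is (n+1)*P_n(x).
    """
    nodes, weights = [], []
    for i in range(n + 1):
        x = -math.cos(math.pi * i / n)  # Chebyshev-Gauss-Lobatto initial guess
        for _ in range(max_iter):
            p_n, p_nm1 = legendre_pair(n, x)
            dx = (x * p_n - p_nm1) / ((n + 1) * p_n)  # Newton step q / q'
            x -= dx
            if abs(dx) < tol:
                break
        p_n, _ = legendre_pair(n, x)
        nodes.append(x)
        weights.append(2.0 / (n * (n + 1) * p_n ** 2))
    return nodes, weights


if __name__ == "__main__":
    nodes, weights = gll_points_weights(4)
    # Quadrature approximation of the integral of x^6 over [-1, 1]
    print(sum(w * x ** 6 for x, w in zip(nodes, weights)))
```

Because the endpoints -1 and 1 are always nodes, neighbouring elements share collocation points on their boundaries, which is what makes this rule convenient for spectral elements.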


Nonconforming Mesh


■ Why nonconforming: local area refinement 

■ What is nonconforming

■ Problem raised by a nonconforming mesh:
continuity across element boundaries


Mortar Element Method 

Introduces a new mortar trace space: 

■ Preserve local structure 

■ Decouple the local/global computation 

■ Efficient for parallel computation 

■ Degrees of freedom are located in

- Element interior

- Mortar elements





[Figure: Mortar collocation points on nonconforming edges. Labels: nonconforming edge; mortar elements corresponding to the top face of element 3; mortar elements corresponding to the right face of element 3; collocation point on mortar elements.]


Mortar Element Method


Continuity across nonconforming
element interfaces

$C^0$ continuity is replaced by two conditions:

1. Vertex condition: the solution at an
element vertex equals the solution at the
corresponding mortar point.

2. $L^2$ condition: the difference between the solution
on an element face and on its related
mortar elements is minimized in an integral sense.

[Figure legend: ■ solution on elements, ■ solution on mortars]
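
The two conditions can be written compactly as a sketch in assumed notation (none of these symbols appear on the slides): $u$ is the element solution restricted to a face $\Gamma$, $\phi$ is the mortar function on $\Gamma$, and $N$ is the polynomial degree. The test space $\mathbb{P}_{N-2}$ follows the usual mortar spectral element formulation and is an assumption here.

```latex
% Assumed notation: u = element solution on face \Gamma,
% \phi = mortar function on \Gamma, N = polynomial degree.
\begin{align}
  u(v) &= \phi(v)
    && \text{for every vertex } v \text{ of } \Gamma, \\
  \int_{\Gamma} \bigl(u - \phi\bigr)\,\psi \, ds &= 0
    && \text{for all } \psi \in \mathbb{P}_{N-2}(\Gamma).
\end{align}
```

Evaluating the integral with quadrature turns these conditions into a linear relation between mortar values and element face values, which is the kind of relation a local transformation matrix can encode.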


Mapping for Nonconforming Faces



■ $Q$ for mapping from mortar to element

■ $Q^T$ for reverse mapping from element to mortar


Discrete Equations

conforming: $A T^{n+1} = B f^{n+1}$

nonconforming: $\mathcal{Q}^T A \mathcal{Q} T^{n+1} = \mathcal{Q}^T B f^{n+1}$

where $\mathcal{Q}$ is the global transformation matrix,
assembled from the local transformation matrices $Q$

• symmetrical 

• positive definite 

Solved by CG with a Diagonal Preconditioner 
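
A minimal sketch of conjugate gradients with a diagonal (Jacobi) preconditioner, for a symmetric positive definite system. The dense list-of-lists matrix, the function names, and the tolerance are assumptions made to keep the example self-contained; a real implementation would apply the operator element by element rather than store it.

```python
def matvec(A, x):
    """Dense matrix-vector product for a list-of-lists matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pcg_diagonal(A, b, tol=1e-12, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    The preconditioner is M = diag(A), so applying M^{-1} is a
    componentwise division by the diagonal entries.
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)                                   # residual b - A x with x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    print(pcg_diagonal(A, b))
```

The diagonal preconditioner is cheap to build and apply and needs no communication beyond what the matrix-vector product already requires, which suits a parallel setting.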


Mesh Adaptation Procedure

■ Perform adaptation every m time steps 

■ Refine elements close to the high-error region:
elements that overlap the heat source

■ Coarsen the grid elsewhere if possible
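
The marking step of this procedure can be sketched as follows. The representation is hypothetical: elements are axis-aligned cubes given as (lower corner, edge length), and the heat source is modelled as a sphere with an assumed radius; neither choice comes from the slides.

```python
def overlaps_source(lo, size, center, radius):
    """True if the cube [lo, lo + size]^3 intersects the sphere (center, radius)."""
    d2 = 0.0
    for l, c in zip(lo, center):
        nearest = min(max(c, l), l + size)  # closest point of the cube along this axis
        d2 += (c - nearest) ** 2
    return d2 <= radius ** 2


def mark_elements(elements, center, radius):
    """Mark elements overlapping the source for refinement, the rest for coarsening."""
    return [
        "refine" if overlaps_source(lo, size, center, radius) else "coarsen"
        for lo, size in elements
    ]


def adaptation_steps(num_steps, m):
    """Time steps at which adaptation runs when it is performed every m steps."""
    return [n for n in range(1, num_steps + 1) if n % m == 0]


if __name__ == "__main__":
    elements = [((0.0, 0.0, 0.0), 0.5), ((0.5, 0.5, 0.5), 0.5)]
    print(mark_elements(elements, (0.30, 0.28, 0.28), 0.1))
    print(adaptation_steps(10, 5))
```

A "coarsen" mark is only a request: whether an element can actually be coarsened depends on its neighbours and on mesh constraints that this sketch does not model.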





Sample problem

Initial and boundary conditions:

■ Initial grid: $[0,1]^3$

■ Initial temperature: T = 0

■ Initial heat source location: (0.30, 0.28, 0.28)

■ Heat source strength: P = 10

■ Heat source movement / velocity field:
v = (1,1,1)

■ Boundary condition: T = 0 on all faces


Problem parameters and Verification

■ Verification quantity: $\int_\Omega T \, d\Omega$ at the last time step


Current Status

■ Benchmark Design: 

http://www.nas.nasa.gov/Software/NPB

■ Pencil and Paper Specification: 2/2003 

■ Sequential implementation: 

- under construction 

■ Parallel implementation:

- Space filling curve to handle the load 
balance 
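
A minimal sketch of load balancing with a space-filling curve. The slides do not say which curve is used; Morton (Z-order) is assumed here purely for illustration, and the integer cell coordinates and function names are hypothetical. Cells are sorted along the curve and the ordered list is cut into contiguous, nearly equal chunks, one per processor.

```python
def morton3d(ix, iy, iz, bits=10):
    """Interleave the low `bits` bits of three integer coordinates (Z-order key)."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)
        code |= ((iy >> b) & 1) << (3 * b + 1)
        code |= ((iz >> b) & 1) << (3 * b + 2)
    return code


def partition(cells, num_procs):
    """Sort cells along the curve and split them into contiguous chunks."""
    ordered = sorted(cells, key=lambda c: morton3d(*c))
    n = len(ordered)
    return [
        ordered[p * n // num_procs:(p + 1) * n // num_procs]
        for p in range(num_procs)
    ]


if __name__ == "__main__":
    cells = [(i, j, k) for i in range(2) for j in range(2) for k in range(2)]
    for rank, chunk in enumerate(partition(cells, 4)):
        print(rank, chunk)
```

Cells that are close along the curve tend to be close in space, so contiguous chunks give each processor a fairly compact region. After refinement or coarsening, only the cut points along the curve need to move.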